MOD-DISCRETE EXPANSIONS



A. D. BARBOUR, E. KOWALSKI, AND A. NIKEGHBALI 

Abstract. In this paper, we consider approximating expansions for 
the distribution of integer valued random variables, in circumstances in 
which convergence in law cannot be expected. The setting is one in 
which the simplest approximation to the n'th random variable $X_n$ is
by a particular member $R_n$ of a given family of distributions, whose
variance increases with n. The basic assumption is that the ratio of the
characteristic function of $X_n$ and that of $R_n$ converges to a limit in a
prescribed fashion. Our results cover a number of classical examples in 
probability theory, combinatorics and number theory. 



1. Introduction 

In a remarkable paper, Hwang (1999) considered sequences of non-negative
integer valued random variables $X_n$, whose probability generating
functions $f_{X_n}$ satisfy

$e^{\lambda_n(1-z)} f_{X_n}(z) \to g(z)$

for all $z \in \mathbb{C}$ with $|z| \le \eta$, for some $\eta > 1$, where the function g is
analytic, and $\lim_{n\to\infty} \lambda_n = \infty$. Under some extra conditions, he exhibits
tight bounds on the accuracy of the approximation of the distribution of
$X_n$ by a Poisson distribution with carefully chosen mean, close to $\lambda_n$.
Independently, motivated by specific examples arising in Random Matrix
Theory and number theory, Jacod, Kowalski and Nikeghbali (2008) explored
the properties of a related ratio convergence for real valued random
variables, namely when the characteristic functions $\phi_{X_n}$ satisfy

$\exp\{-i\beta_n\theta + \gamma_n\theta^2/2\}\,\phi_{X_n}(\theta) \to \psi(\theta)$

locally uniformly in $\theta$ (in particular, bounds on the error in the
approximation of the distribution $P_{X_n}$ by the normal distribution
$N(\beta_n, \gamma_n)$ can be simply deduced). Kowalski and Nikeghbali (2009) went on
to explore some consequences (and structural aspects in arithmetic cases)
of the corresponding uniform limit

(1.1) $\exp\{\lambda_n(1 - e^{i\theta})\}\,\phi_{X_n}(\theta) \to \psi(\theta), \quad 0 < |\theta| \le \pi,$

for random variables $X_n$ (usually integer valued), as in Hwang (1999) with
the Poisson characteristic function in the ratio. Note that the conditions on 
the distributions of the $X_n$ are now much weaker than those of Hwang (1999):
for instance, his conditions require the Xn to take only non-negative values, 
and to have exponential tails. On the other hand, the probabilistic results 
that Hwang derives are much more sophisticated. He establishes bounds 
on the error in his approximations with respect to a number of different 
metrics, and shows that they are sharp. For instance, for the Kolmogorov 
and total variation distances, his bounds are typically of order $O(\lambda_n^{-1})$, and
he also gives the value of the leading asymptotic term in the error. 

In this paper, we work with integer valued random variables, and with
characteristic function conditions that sharpen (1.1), with the aim of
developing approximations of higher order. Our main result, Proposition 2.1,
is very simple and explicit. This enables us to dispense with asymptotic
settings, and to prove concrete error bounds. As a direct consequence, we
are able to deduce a Poisson-Charlier approximation with error of order
$O(\lambda_n^{-(r+1)/2})$, for any prescribed r, assuming that Hwang's conditions hold.
Our Poisson-Charlier expansions are derived under more general conditions,
in which the $X_n$ may have only a few finite moments. These are established
in Section 3, and simpler, translated Poisson approximations are considered
in Section 4.

Hwang (1999) notes that his methods are also applicable to families of 
distributions other than the Poisson family, and gives examples using the 
Bessel family. Our approach allows one to derive expansions based on any 
discrete family of distributions, as shown in Section 5, provided that their
characteristic functions satisfy a simple condition, and this without any extra 
effort. Indeed, the main problem is to identify the higher order terms in the 
expansions. These turn out to be simply the higher order differences of the 
basic distribution, leading, for example, to the Charlier polynomial factors 
in the Poisson case. We discuss some examples, to sums of independent 
integer valued random variables, to Hwang's setting and to the Erdős-Kac
theorem, in Section 6.

Remark. We recall the motivation behind the terminology (mod-gaussian, 
mod-poisson, and here mod-discrete): the simplest example leading to limits 
like (say) (1.1) is when $X_n = P_n + Y$, where $P_n$ has Poisson distribution
Po($\lambda_n$) and is independent of Y, and where $\psi(\theta)$ is the characteristic function
of Y. Thus the sequence converges to Y "modulo Poisson variables".

2. The basic estimate 

We frame our approximations in terms of three distances between (signed)
measures $\mu$ and $\nu$ on the integers: the point metric

$d_{\mathrm{loc}}(\mu,\nu) := \sup_{j\in\mathbb{Z}} |\mu\{j\} - \nu\{j\}|,$

the Kolmogorov distance

$d_K(\mu,\nu) := \sup_{j\in\mathbb{Z}} |\mu\{(-\infty,j]\} - \nu\{(-\infty,j]\}|,$

and the total variation norm

$\|\mu - \nu\| := \sum_{j\in\mathbb{Z}} |\mu\{j\} - \nu\{j\}| = 2\sup_{A\subset\mathbb{Z}} |\mu(A) - \nu(A)|.$

Other metrics could also be treated using our results. Our conditions are
expressed in terms of characteristic functions, defined, for a finite signed
measure $\alpha$ on $\mathbb{Z}$, by $\phi_\alpha(\theta) := \sum_{j\in\mathbb{Z}} e^{ij\theta}\alpha\{j\}$, $|\theta| \le \pi$. The essence of our
argument is the following simple result, linking the closeness of the signed
measures to the closeness of their characteristic functions, when these have
a common factor involving a 'large' parameter $\rho$.

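Proposition 2.1. Let $\mu$ and $\nu$ be finite signed measures on $\mathbb{Z}$, with
characteristic functions $\phi_\mu$ and $\phi_\nu$ respectively. Suppose that $\phi_\mu = \psi_\mu\chi$ and
$\phi_\nu = \psi_\nu\chi$, where, for some $\gamma_1, \gamma_2, \rho, t > 0$,

$|\psi_\mu(\theta) - \psi_\nu(\theta)| \le \gamma_1|\theta|^t$ and $|\chi(\theta)| \le \gamma_2 e^{-\rho\theta^2}$ for all $|\theta| \le \pi$.

Then, writing $\gamma = \gamma_1\gamma_2$, there are explicit constants $\alpha_{1t}$ and $\alpha_{2t}$ such that

1. $\sup_{j\in\mathbb{Z}} |\mu\{j\} - \nu\{j\}| \le \alpha_{1t}\gamma(\rho \vee 1)^{-(t+1)/2}$;

2. $\sup_{a\le b\in\mathbb{Z}} |\mu\{[a,b]\} - \nu\{[a,b]\}| \le \alpha_{2t}\gamma(\rho \vee 1)^{-t/2}$.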
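Proof. For any $j \in \mathbb{Z}$, the Fourier inversion formula gives

(2.1) $\mu\{j\} - \nu\{j\} = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-ij\theta}(\psi_\mu(\theta) - \psi_\nu(\theta))\chi(\theta)\,d\theta,$

from which our assumptions imply directly that

$|\mu\{j\} - \nu\{j\}| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} \gamma|\theta|^t \exp\{-\rho\theta^2\}\,d\theta.$

For $\rho \le 1$, we thus have

$|\mu\{j\} - \nu\{j\}| \le \frac{\gamma\pi^t}{t+1}.$

For $\rho \ge 1$, it is immediate that

$|\mu\{j\} - \nu\{j\}| \le \frac{\gamma}{2\pi}\int_{-\infty}^{\infty} |\theta|^t e^{-\rho\theta^2}\,d\theta = \beta_{1t}\gamma\rho^{-(t+1)/2},$

with $\beta_{1t} := 2^{-(t+2)/2}\pi^{-1/2}m_t$; here, $m_t$ denotes the t-th absolute moment of
the standard normal distribution. Setting $\alpha_{1t} := \max\{\beta_{1t}, \pi^t/(t+1)\}$, this proves
part 1. The second part is similar, adding (2.1) over $a \le j \le b$, and
estimating

$\left|\sum_{j=a}^{b} e^{-ij\theta}\right| \le \frac{2}{|1 - e^{-i\theta}|} \le \frac{\pi}{|\theta|}, \quad 0 < |\theta| \le \pi.$

This gives part 2, with

$\alpha_{2t} := \max\{2^{-t/2}m_{t-1}\sqrt{\pi/2},\ \pi^t/t\}.$ □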
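The Fourier inversion step (2.1) is easy to check numerically. The following is a minimal sketch, not part of the original argument: it recovers the point probabilities of a Poisson distribution from its characteristic function by an equally spaced Riemann sum over $[-\pi, \pi)$. The helper names `poisson_pmf`, `poisson_cf` and `invert_cf`, and the grid size, are illustrative assumptions.

```python
import cmath
import math


def poisson_pmf(lam, j):
    """Po(lam){j}, computed on the log scale for numerical stability."""
    if j < 0:
        return 0.0
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))


def poisson_cf(lam, theta):
    """Characteristic function of Po(lam) evaluated at theta."""
    return cmath.exp(lam * (cmath.exp(1j * theta) - 1.0))


def invert_cf(cf, j, n_grid=512):
    """Approximate (1/2pi) * int_{-pi}^{pi} exp(-i j theta) cf(theta) d theta
    by a Riemann sum on n_grid equally spaced points (hypothetical helper).
    For a measure on the integers this returns the mass at j, up to
    aliasing from masses at j + n_grid, j - n_grid, ..."""
    total = 0.0 + 0.0j
    for k in range(n_grid):
        theta = -math.pi + 2.0 * math.pi * k / n_grid
        total += cmath.exp(-1j * j * theta) * cf(theta)
    return (total / n_grid).real
```

For example, `invert_cf(lambda th: poisson_cf(3.0, th), 2)` should agree with `poisson_pmf(3.0, 2)` up to rounding error, since the Poisson mass beyond 500 is negligible.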

In particular, the second part bounds the distance between the two measures
in the Kolmogorov distance. We shall principally be concerned with taking $\mu$
to be the distribution of a random variable X; we allow $\nu$ to be a signed
measure largely for reasons of technical convenience.

For some applications, a slight weakening of the conditions in
Proposition 2.1 is useful. The following result is proved in exactly the same way as
before.

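Proposition 2.2. Let $\mu$ and $\nu$ be finite signed measures on $\mathbb{Z}$, with
characteristic functions $\phi_\mu$ and $\phi_\nu$ respectively. Suppose that $\phi_\mu = \psi_\mu\chi$ and
$\phi_\nu = \psi_\nu\chi$, where, for some $\theta_0, \gamma, \varepsilon, \eta, \rho > 0$ and for positive pairs $\gamma_m, t_m$,
$1 \le m \le M$, we have

$|\psi_\mu(\theta) - \psi_\nu(\theta)| \le \sum_{m=1}^{M} \gamma_m|\theta|^{t_m} + \varepsilon$ and $|\chi(\theta)| \le \gamma e^{-\rho\theta^2}$, $0 < |\theta| \le \theta_0$;

$|\phi_\mu(\theta) - \phi_\nu(\theta)| \le \eta$, $\theta_0 < |\theta| \le \pi$.

Then, with notation as for Proposition 2.1, we have

1. $\sup_{j\in\mathbb{Z}} |\mu\{j\} - \nu\{j\}| \le \sum_{m=1}^{M} \gamma_m\gamma\alpha_{1t_m}(\rho \vee 1)^{-(t_m+1)/2} + \tilde\alpha_1\gamma\varepsilon + \tilde\alpha_2\eta$;

2. $\sup_{a_0\le a\le b\le b_0} |\mu\{[a,b]\} - \nu\{[a,b]\}|
   \le \sum_{m=1}^{M} \gamma_m\gamma\alpha_{2t_m}(\rho \vee 1)^{-t_m/2} + (b_0 - a_0 + 1)(\tilde\alpha_1\gamma\varepsilon + \tilde\alpha_2\eta)$,

where

$\tilde\alpha_1 := \left(\frac{\theta_0}{\pi} \wedge \frac{1}{2\sqrt{\pi\rho}}\right); \quad \tilde\alpha_2 := \left(1 - \frac{\theta_0}{\pi}\right).$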

The presence of the factor $(b_0 - a_0 + 1)$ in the second bound means that
a direct bound on the Kolmogorov distance between the signed measures $\mu$
and $\nu$ is not immediately visible. The following corollary is however easily
deduced.



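Corollary 2.3. Under the conditions of Proposition 2.2,

$d_K(\mu,\nu) \le \inf_{a\le b}\left(\varepsilon^{(K)}_{ab} + |\mu|\{(-\infty,a)\cup(b,\infty)\} + |\nu|\{(-\infty,a)\cup(b,\infty)\}\right);$

$\|\mu - \nu\| \le \inf_{a\le b}\left(\varepsilon^{(1)}_{ab} + |\mu|\{(-\infty,a)\cup(b,\infty)\} + |\nu|\{(-\infty,a)\cup(b,\infty)\}\right),$

where

$\varepsilon^{(K)}_{ab} := \sum_{m=1}^{M} \gamma_m\gamma\alpha_{2t_m}(\rho \vee 1)^{-t_m/2} + (b - a + 1)(\tilde\alpha_1\gamma\varepsilon + \tilde\alpha_2\eta);$

$\varepsilon^{(1)}_{ab} := (b - a + 1)\left\{\sum_{m=1}^{M} \gamma_m\gamma\alpha_{1t_m}(\rho \vee 1)^{-(t_m+1)/2} + (\tilde\alpha_1\gamma\varepsilon + \tilde\alpha_2\eta)\right\}.$

If also $\mu$ is a probability measure, then

$d_K(\mu,\nu) \le \inf_{a\le b}\left(1 - \nu\{[a,b]\} + 2\varepsilon^{(K)}_{ab} + |\nu|\{(-\infty,a)\cup(b,\infty)\}\right).$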




Proof. The inequality for the total variation norm is immediate. For the
Kolmogorov distance, by considering the possible positions of x in relation
to $a \le b$, we have

$|\mu\{(-\infty,x]\} - \nu\{(-\infty,x]\}|$

$\quad \le \sup_{y<a} |\mu\{(-\infty,y]\} - \nu\{(-\infty,y]\}| + \sup_{a\le y\le b} |\mu\{[a,y]\} - \nu\{[a,y]\}|$

$\qquad + \sup_{y>b} |\mu\{(b,y]\} - \nu\{(b,y]\}|$

$\quad \le |\mu|\{(-\infty,a)\cup(b,\infty)\} + |\nu|\{(-\infty,a)\cup(b,\infty)\} + \varepsilon^{(K)}_{ab}.$

If $\mu$ is a probability measure, we have

$|\mu|\{(-\infty,a)\cup(b,\infty)\} = 1 - \mu\{[a,b]\} \le 1 - \nu\{[a,b]\} + \varepsilon^{(K)}_{ab}.$ □



Under slightly stronger conditions than those of Proposition 2.1, a much
neater total variation bound can be deduced.

Proposition 2.4. Let $\mu$ and $\nu$ be finite signed measures on $\mathbb{Z}$, with
characteristic functions $\phi_\mu = \psi_\mu\chi$ and $\phi_\nu = \psi_\nu\chi$ respectively, where $\chi(\theta) :=
\gamma_2 e^{-u(\theta)}$ for some $\gamma_2 > 0$, and $u(0) = 0$. Suppose now that u and the
difference $d_{\mu\nu} := \psi_\mu - \psi_\nu$ are both twice differentiable, that $u'(0) = d_{\mu\nu}(0) = 0$,
and that, for some $\gamma_1, \gamma_2, \gamma_3 > 0$, $\rho \ge 1$ and $t \ge 2$,

$|d''_{\mu\nu}(\theta)| \le \gamma_1|\theta|^{t-2}$, $|u''(\theta)| \le \gamma_3\rho$ and $u(\theta) \ge \rho\theta^2$ for all $|\theta| \le \pi$.

Then, writing $\gamma = \gamma_1\gamma_2$, there is a constant $\alpha_3 := \alpha_3(t,\gamma_3)$ such that

$\|\mu - \nu\| \le \alpha_3\gamma(\rho \vee 1)^{-t/2}.$



Proof. First, the assumptions on d^i, and u give 
(2.2) 

$|u'(\theta)| \le \gamma_3\rho|\theta|.$

In particular, for $|j| \le \lfloor\sqrt{\rho}\rfloor$, we can apply part 1 of Proposition 2.1, which
gives

(2.3) < ^(r%(pvi)-(*+^)/^ 

For the remaining j, integrating the Fourier inversion formula (2.1) twice by
parts gives

(2.4) $\mu\{j\} - \nu\{j\} = -\frac{1}{2\pi j^2}\int_{-\pi}^{\pi} e^{-ij\theta}\{d''_{\mu\nu}(\theta) - 2d'_{\mu\nu}(\theta)u'(\theta)
      + d_{\mu\nu}(\theta)((u'(\theta))^2 - u''(\theta))\}\chi(\theta)\,d\theta.$

Substituting the bounds from (2.2) into (2.4) gives

\f^{j}-Hj}\ 
< 47/33(^,73)/>-(*+^)/^ 

after some calculation, where, with $m_t$ as in Proposition 2.1,

/33(t,73) := ^^^;-^ {4t + (2t + 1)73 + (t + Ihl}. 

Hence

iii>rv^i 

and the proposition follows directly, with a3(t, 73) := 2/33(^,73) + 7^^- CH 



3. Poisson-Charlier expansions 

Suppose first that X is an integer valued random variable having
characteristic function $\phi_X$ of the form $\phi_X(\theta) = \psi(\theta)p_\lambda(\theta)$, $|\theta| \le \pi$, where $p_\lambda$
denotes the characteristic function of the Poisson distribution Po($\lambda$) with
mean $\lambda$. Underlying our considerations is an unspecified asymptotic setting
in which $\lambda$ is large and $\psi$ is thought of as (almost) fixed, but we do not need
to make direct use of this. We now assume in addition that, for some $r \in \mathbb{N}_0$
and for some $K_{r\delta} > 0$, $0 < \delta \le 1$,

(3.1) $|\psi(\theta) - \psi_r(\theta)| \le K_{r\delta}|\theta|^{r+\delta}, \quad |\theta| \le \pi,$

where

(3.2) $\psi_r(\theta) := \sum_{l=0}^{r} a_l(i\theta)^l$

is a polynomial of degree r with real coefficients $a_l$, thus implying that
$a_0 = 1$. If $\psi$ is itself the characteristic function of a probability measure,
this assumption roughly corresponds to assuming that the measure has (at
least) r finite moments. Alternatively, we could assume that

(3.3) $|\psi(\theta) - \tilde\psi_r(\theta)| \le K_{r\delta}|\theta|^{r+\delta}, \quad |\theta| \le \pi,$

where

(3.4) $\tilde\psi_r(\theta) := \sum_{l=0}^{r} \tilde a_l(e^{i\theta} - 1)^l,$

again with real coefficients $\tilde a_l$ and $\tilde a_0 = 1$. If $r = 0$, and thus $\psi_0(\theta) = 1$ for
all $\theta$, we could now immediately use (3.1) in conjunction with Proposition 2.1
to approximate the distribution of X by the Poisson distribution Po($\lambda$), with
an error in Kolmogorov distance of order $\lambda^{-\delta/2}$; note that

(3.5) $|p_\lambda(\theta)| = \exp\{-\lambda(1 - \cos\theta)\} \le e^{-\rho\theta^2}, \quad |\theta| \le \pi,$

with $\rho := 2\pi^{-2}\lambda$.

We now want to go further, and use (3.1) with higher values of r to justify
more sophisticated approximations with a higher order of accuracy. In order
to do so, we need to find 'nice' signed measures $\nu_r$, whose characteristic
functions are at least as close to $\psi_r(\theta)p_\lambda(\theta)$ and $\tilde\psi_r(\theta)p_\lambda(\theta)$ as $\psi(\theta)p_\lambda(\theta)$
is. Now $\tilde\psi_r(\theta)p_\lambda(\theta)$ is itself the characteristic function of a signed measure,
which we can then take as our choice of $\nu_r$. To identify $\nu_r$, observe that,
if $\phi_\mu$ is the characteristic function of a signed measure $\mu$, then $(e^{i\theta} - 1)^l\phi_\mu(\theta)$
is the characteristic function of the l-th difference $D^l\mu$ of $\mu$:

$D^l\mu\{j\} := \sum_{k=0}^{l} \binom{l}{k}(-1)^k \mu\{j - l + k\}.$

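For the Poisson distribution, this yields the Poisson-Charlier signed
measures:

(3.6) $\tilde\psi_r(\theta)p_\lambda(\theta) = \sum_{l=0}^{r} \tilde a_l(e^{i\theta} - 1)^l p_\lambda(\theta)$

is the characteristic function of the signed measure $\nu = \nu_r(\lambda; \tilde a_1, \dots, \tilde a_r)$ on
$\mathbb{N}_0$ defined by

(3.7) $\nu\{j\} := \mathrm{Po}(\lambda)\{j\}\left\{1 + \sum_{l=1}^{r} (-1)^l \tilde a_l C_l(j;\lambda)\right\},$

where

(3.8) $C_l(j;\lambda) := \sum_{k=0}^{l} (-1)^k \binom{l}{k}\binom{j}{k} k!\,\lambda^{-k}$

denotes the l-th Charlier polynomial (Chihara 1978, (1.9), p. 171).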
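The definitions (3.7) and (3.8) translate directly into code. The following is a small illustrative sketch (the function names are our own, not from any library) that evaluates the Charlier polynomials and the Poisson-Charlier signed measure, together with the l-th difference of Po($\lambda$) computed from the binomial formula for $D^l\mu$, so that the identity $D^l\mathrm{Po}(\lambda)\{j\} = (-1)^l C_l(j;\lambda)\,\mathrm{Po}(\lambda)\{j\}$ can be checked numerically.

```python
import math


def poisson_pmf(lam, j):
    """Po(lam){j}, with the convention that the mass is 0 for j < 0."""
    if j < 0:
        return 0.0
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))


def charlier(l, j, lam):
    """C_l(j; lam) = sum_k (-1)^k C(l,k) C(j,k) k! lam^(-k), as in (3.8)."""
    return sum(
        (-1) ** k * math.comb(l, k) * math.comb(j, k) * math.factorial(k) / lam ** k
        for k in range(l + 1)
    )


def poisson_charlier(lam, a_tilde, j):
    """nu{j} from (3.7); a_tilde = [a~_1, ..., a~_r] (illustrative input)."""
    correction = sum(
        (-1) ** l * a * charlier(l, j, lam)
        for l, a in enumerate(a_tilde, start=1)
    )
    return poisson_pmf(lam, j) * (1.0 + correction)


def lth_difference(lam, l, j):
    """D^l Po(lam){j} = sum_k C(l,k) (-1)^k Po(lam){j - l + k}."""
    return sum(
        math.comb(l, k) * (-1) ** k * poisson_pmf(lam, j - l + k)
        for k in range(l + 1)
    )
```

Since each difference $D^l\mathrm{Po}(\lambda)$ with $l \ge 1$ has total mass zero, summing `poisson_charlier(lam, a_tilde, j)` over a long enough range of j should return a value very close to 1, whatever coefficients are supplied.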

Note that, if $\binom{j}{k}$ is replaced by $j^k/k!$ in (3.8), one obtains the binomial
expansion of $(1 - j/\lambda)^l$. As this suggests, the values of $C_l(j;\lambda)$ are in fact
small for j near $\lambda$ if $\lambda$ is large:

(3.9) $|C_l(j;\lambda)| \le 2^{l-1}\{|1 - j/\lambda|^l + (l/\sqrt{\lambda})^l\}$

(Barbour & Čekanavičius 2002, Lemma 6.1). (3.9) thus implies that the
l-th term in the sum in (3.7) has total variation norm at most $|\tilde a_l| c_l \lambda^{-l/2}$,
for a universal constant $c_l$. It also implies that, in any interval of the form
$|j - \lambda| \le c\sqrt{\lambda}$, which is where the probability mass of Po($\lambda$) is mostly to be
found, the correction to the Poisson measure Po($\lambda$) is of uniform relative
order $O(\lambda^{-1/2})$. Indeed, the Chernoff inequalities for $Z \sim \mathrm{Po}(\lambda)$ can be
expressed in the form

(3.10) $\max\{P[Z > \lambda(1+\delta)],\ P[Z < \lambda(1-\delta)]\} \le \exp\{-\lambda\delta^2/2(1+\delta/3)\},$

for $0 < \delta \le 1$ (Chung & Lu 2006, Theorem 3.2). Since also, from (3.8),

$|C_l(j;\lambda)| \le (1 + j/\lambda)^l \le 2^l$ if $0 \le j \le \lambda$,

and since

$\binom{j}{k} k!\,\lambda^{-k}\,\mathrm{Po}(\lambda)\{j\} = \mathrm{Po}(\lambda)\{j-k\} \le \mathrm{Po}(\lambda)\{j-l\}$

if $0 \le k \le l$ and $j \ge l + \lambda$, it follows that, for any $l \ge 0$, we have

$\sum_{j=0}^{m} |C_l(j;\lambda)|\,\mathrm{Po}(\lambda)\{j\} \le 2^l P[Z \le m] \le 2^l \exp\{-(\lambda - m)^2/3\lambda\}$

for $m \le \lambda$, and

$\sum_{j\ge m} |C_l(j;\lambda)|\,\mathrm{Po}(\lambda)\{j\} \le 2^l P[Z \ge m - l] \le 2^l \exp\{-(m - l - \lambda)^2/3\lambda\},$

for $\lambda + l \le m \le 2\lambda + l$.
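The two Poisson tail bounds used here can be sanity-checked numerically. The sketch below is only an illustration, with hypothetical helper names: it compares exact Poisson tail probabilities with the exponential bounds $\exp\{-(\lambda - m)^2/3\lambda\}$ for the lower tail and $\exp\{-(m - \lambda)^2/3\lambda\}$ for the upper tail, over the ranges $0 \le m \le \lambda$ and $\lambda \le m \le 2\lambda$ respectively.

```python
import math


def poisson_pmf(lam, j):
    """Po(lam){j}, with the convention that the mass is 0 for j < 0."""
    if j < 0:
        return 0.0
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))


def lower_tail(lam, m):
    """P[Z <= m] for Z ~ Po(lam)."""
    return sum(poisson_pmf(lam, j) for j in range(m + 1))


def upper_tail(lam, m):
    """P[Z >= m] for Z ~ Po(lam)."""
    return 1.0 - lower_tail(lam, m - 1)


def tail_bound(lam, m):
    """exp{-(m - lam)^2 / (3 lam)}, the bound on either tail at distance |m - lam|."""
    return math.exp(-((m - lam) ** 2) / (3.0 * lam))


def bounds_hold(lam, tol=1e-12):
    """Check both tail bounds on the integer ranges where they are used."""
    lo_ok = all(lower_tail(lam, m) <= tail_bound(lam, m) + tol
                for m in range(0, math.floor(lam) + 1))
    hi_ok = all(upper_tail(lam, m) <= tail_bound(lam, m) + tol
                for m in range(math.ceil(lam), math.floor(2 * lam) + 1))
    return lo_ok and hi_ok
```

A call such as `bounds_hold(25.0)` is expected to return `True`; the small tolerance only absorbs floating-point rounding in the tail sums.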

Writing $|\nu|$ to denote the absolute measure associated with $\nu$, it thus
follows that

$|\nu|\{[0,m]\} \le B_r e^{-(\lambda - m)^2/3\lambda}, \quad 0 \le m \le \lambda;$

(3.11) $|\nu|\{[m,\infty)\} \le B_r e^{-(m - r - \lambda)^2/3\lambda}, \quad \lambda + r \le m \le 2\lambda,$

where $B_r := 1 + \sum_{l=1}^{r} 2^l|\tilde a_l|$, demonstrating concentration of measure for $\nu$
on a scale of $\sqrt{\lambda}$ around $\lambda$. Moreover, it can be deduced from (3.9) that
there exists a positive constant $d = d(\tilde a_1, \dots, \tilde a_r)$ such that $\nu\{j\} \ge 0$ for
$|j - \lambda| \le d\lambda$, and it follows from (3.11) that $|\nu|\{j : |j - \lambda| > d\lambda\} = O(e^{-q\lambda})$
for some $q > 0$. Since also $\nu\{\mathbb{N}_0\} = 1$, it thus follows that, even if $\nu$ is
formally a signed measure, it differs from a probability only on a set of
measure exponentially small with $\lambda$.

Thus, if (3.3) holds, it follows that X has characteristic function $\psi(\theta)p_\lambda(\theta)$
and $\nu := \nu_r(\lambda; \tilde a_1, \dots, \tilde a_r)$ has characteristic function $\phi_\nu = \tilde\psi_r(\theta)p_\lambda(\theta)$, and
that the conditions of Proposition 2.1 are satisfied with $\mu = P_X$, $t = r + \delta$,
$\gamma = K_{r\delta}$ and $\rho = 2\pi^{-2}\lambda$, this last from (3.5). If, instead, we are given the
inequality (3.1), we can write $e^{i\theta} - 1 = i\theta\sum_{s\ge 0} (i\theta)^s/(s+1)!$ and equate
the coefficients of $(i\theta)^j$ in (3.2) with those for $1 \le j \le r$ in (3.6), giving
$\tilde a_1, \dots, \tilde a_r$ implicitly in terms of $a_1, \dots, a_r$:

(3.12) $a_j = \sum_{l=1}^{j} \tilde a_l \sum_{(s_1,\dots,s_l)\in S_{j-l}} \prod_{t=1}^{l} \frac{1}{(s_t+1)!},$

where $S_m := \{(s_1, \dots, s_l): \sum_{t=1}^{l} s_t = m\}$. With this choice of $\tilde a_1, \dots, \tilde a_r$, it
follows that $\nu = \nu_r(\lambda; \tilde a_1, \dots, \tilde a_r)$ has characteristic function $\phi_\nu = \tilde\psi_r p_\lambda$, where

(3.13) $|\tilde\psi_r(\theta) - \psi_r(\theta)| \le \Gamma_r|\theta|^{r+1}, \quad |\theta| \le \pi,$

for $\Gamma_r := \Gamma_r(a_1, \dots, a_r)$. Hence, in this case, we obtain

(3.14) $|\psi(\theta) - \tilde\psi_r(\theta)| \le (K_{r\delta} + G_{r\delta})|\theta|^{r+\delta}, \quad |\theta| \le \pi,$

with $G_{r\delta} := \Gamma_r\pi^{1-\delta}$, and the conditions of Proposition 2.1 are satisfied
with $\mu = P_X$, $t = r + \delta$, $\gamma = K_{r\delta} + G_{r\delta}$ and $\rho = 2\pi^{-2}\lambda$. Thus, if either
(3.1) or (3.3) is satisfied, a signed measure from the family $\nu_r(\lambda; b_1, \dots, b_r)$
can be found, which approximates the probability measure $P_X$ in the sense
implied by Proposition 2.1. These measures are themselves rather explicit
perturbations of the Poisson distribution Po($\lambda$).

We summarize these considerations in the following theorem, which is
deduced directly from Proposition 2.1. Note that we shall later be primarily
concerned with applications in which $K_{r\delta}$ and $G_{r\delta}$ are not small, and in
which therefore $\lambda$ must be big, if our bounds are to be useful. However, for
the sake of completeness, we phrase our bounds in a form which also allows
for accuracy of approximation if $K_{r\delta} + G_{r\delta}$ is small.

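Theorem 3.1. Let X be a random variable on $\mathbb{Z}$ with distribution $P_X$.
Suppose that its characteristic function $\phi_X$ is of the form $\psi p_\lambda$, where $p_\lambda(\theta)$ is
the characteristic function of the Poisson distribution Po($\lambda$) with mean $\lambda$.
Suppose also that (3.1) is satisfied, for some $r \in \mathbb{N}_0$ and $\delta > 0$. Let
$\nu_r := \nu_r(\lambda; \tilde a_1, \dots, \tilde a_r)$ be as in (3.7), with $\tilde a_1, \dots, \tilde a_r$ given implicitly
by (3.12). Then, writing $t = r + \delta$, we have

1. $d_{\mathrm{loc}}(P_X, \nu_r) := \sup_{j\in\mathbb{Z}} |P_X\{j\} - \nu_r\{j\}|
   \le \alpha'_{1t}(K_{r\delta} + G_{r\delta})(\lambda \vee 1)^{-(t+1)/2}$;

2. $d_K(P_X, \nu_r) := \sup_{j\in\mathbb{Z}} |P_X\{(-\infty,j]\} - \nu_r\{[0,j]\}|
   \le \alpha'_{2t}(K_{r\delta} + G_{r\delta})(\lambda \vee 1)^{-t/2}$,

with

$\alpha'_{1t} := \alpha_{1t}(\pi^2/2)^{(t+1)/2}; \quad \alpha'_{2t} := \alpha_{2t}(\pi^2/2)^{t/2};$

$G_{r\delta} := \Gamma(a_1, \dots, a_r)\pi^{1-\delta}.$

If (3.1) is replaced by (3.3), the corresponding bounds hold with $G_{r\delta} = 0$.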

Theorem 3.1 enables one to deduce simple bounds for other measures of
the distance between $P_X$ and $\nu$. For instance, for the total variation norm,
with judicious choice of $m_1$ and $m_2$, we can use part 1 to bound

(3.15) $\sum_{j=m_1+1}^{m_2-1} |P_X\{j\} - \nu\{j\}| \le (m_2 - m_1 - 1)\sup_j |P_X\{j\} - \nu\{j\}|,$

and then (3.11) and part 2 to take care of the remaining tail probabilities:

(3.16) $\sum_{j\le m_1} |P_X\{j\} - \nu\{j\}| \le P_X\{(-\infty,m_1]\} + |\nu|\{[0,m_1]\}$

$\qquad \le \sup_{l\in\mathbb{N}_0} |P_X\{(-\infty,l]\} - \nu\{[0,l]\}| + 2|\nu|\{[0,m_1]\},$

and

(3.17) $\sum_{j\ge m_2} |P_X\{j\} - \nu\{j\}| \le P_X\{[m_2,\infty)\} + |\nu|\{[m_2,\infty)\}$

$\qquad \le \sup_{l\in\mathbb{N}_0} |P_X\{(-\infty,l]\} - \nu\{[0,l]\}| + 2|\nu|\{[m_2,\infty)\}.$

This gives the following theorem.

Theorem 3.2. Suppose that the conditions of Theorem 3.1 are satisfied,
with (3.14) holding. If $K_{r\delta} + G_{r\delta} \le 1$, there is a constant $\alpha_{4t}$ such that

(3.18) $\|P_X - \nu\| \le \alpha_{4t}(K_{r\delta} + G_{r\delta})(\lambda \vee 1)^{-t/2}
        \max\{1,\ \sqrt{|\log(K_{r\delta} + G_{r\delta})|},\ \sqrt{\log(\lambda + 1)}\};$

if $K_{r\delta} + G_{r\delta} > 1$ and $\lambda^{t/2} \ge K_{r\delta} + G_{r\delta}$, then there is a constant $\alpha_{5t}$ such
that

(3.19) $\|P_X - \nu\| \le \alpha_{5t}(K_{r\delta} + G_{r\delta})\lambda^{-t/2} \max\{1,\ \sqrt{\log(\lambda + 1)}\}.$

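Proof. For $K_{r\delta} + G_{r\delta} \le 1$ and $\lambda \ge 1$, we use both parts of (3.11), with

$m_1 := \lfloor\lambda - c_{r\lambda}\sqrt{\lambda\log(\lambda + 1)}\rfloor$ and $m_2 := \lceil\lambda + r + c_{r\lambda}\sqrt{\lambda\log(\lambda + 1)}\rceil,$

where $\lfloor x\rfloor \le x \le \lceil x\rceil$ denote the integers closest to x, obtaining

$|\nu|\{[0,m_1]\cup[m_2,\infty)\} \le 2B_r(\lambda + 1)^{-c_{r\lambda}^2/3} \le 2B_r(K_{r\delta} + G_{r\delta})(\lambda + 1)^{-(r+1)/2},$

if $c_{r\lambda}^2 := 3(r+1)/2 + |\log(K_{r\delta} + G_{r\delta})|/\log(\lambda + 1)$. Hence, from (3.15) - (3.17),
it follows that

$\|P_X - \nu\| \le \{2c_{r\lambda}\sqrt{\lambda\log(\lambda + 1)} + r + 2\}\alpha'_{1t}(K_{r\delta} + G_{r\delta})\lambda^{-(t+1)/2}$

$\qquad + 2\alpha'_{2t}(K_{r\delta} + G_{r\delta})\lambda^{-t/2} + 4B_r(K_{r\delta} + G_{r\delta})\lambda^{-(r+1)/2},$

so that

$\|P_X - \nu\| \le \beta_{3t}(K_{r\delta} + G_{r\delta})\lambda^{-t/2}
  \max\{1,\ \sqrt{\log(1/(K_{r\delta} + G_{r\delta}))},\ \sqrt{\log(\lambda + 1)}\},$

with $\beta_{3t} := \alpha'_{1t}\{\sqrt{6(r+1)} + r + 4\} + 2\alpha'_{2t} + 4B_r$.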

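For $K_{r\delta} + G_{r\delta} \le 1$ and $\lambda < 1$, we take $m_2 := \lceil\lambda + r + \sqrt{3|\log(K_{r\delta} + G_{r\delta})|}\rceil$
in (3.11), giving

$|\nu|\{[m_2,\infty)\} \le B_r(K_{r\delta} + G_{r\delta}),$

and

$\|P_X - \nu\| \le (r + 2 + \sqrt{3|\log(K_{r\delta} + G_{r\delta})|})\alpha'_{1t}(K_{r\delta} + G_{r\delta})$

so that

$\|P_X - \nu\| \le \beta_{4t}(K_{r\delta} + G_{r\delta}) \max\{1,\ \sqrt{|\log(K_{r\delta} + G_{r\delta})|},\ \sqrt{\log(\lambda + 1)}\},$

with $\beta_{4t} := \alpha'_{1t}\{r + 4\} + \alpha'_{2t} + 2B_r$. Then (3.18) follows, with $\alpha_{4t}$ taken to be
the larger of the constants obtained in the two cases $\lambda \ge 1$ and $\lambda < 1$.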
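For $\lambda^{t/2} \ge K_{r\delta} + G_{r\delta} > 1$, we take $m_1 := \lfloor\lambda - c_r\sqrt{\lambda\log(\lambda + 1)}\rfloor$ and
$m_2 := \lceil\lambda + r + c_r\sqrt{\lambda\log(\lambda + 1)}\rceil$, with $c_r := \sqrt{3t/2}$, giving

$|\nu|\{[0,m_1]\cup[m_2,\infty)\} \le 2B_r(\lambda + 1)^{-t/2}.$

Using (3.15) - (3.17), it follows that

$\|P_X - \nu\| \le \{2c_r\sqrt{\lambda\log(\lambda + 1)} + r + 2\}\alpha'_{1t}(K_{r\delta} + G_{r\delta})\lambda^{-(t+1)/2}$
$\qquad + 2\alpha'_{2t}(K_{r\delta} + G_{r\delta})\lambda^{-t/2} + 4B_r\lambda^{-t/2}$

$\quad \le \alpha_{5t}(K_{r\delta} + G_{r\delta})\lambda^{-t/2}\max\{1,\ \sqrt{\log(\lambda + 1)}\},$

with $\alpha_{5t} = \alpha'_{1t}(\sqrt{6(r+1)} + r + 2) + 2\alpha'_{2t} + 4B_r$. □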

Note that if $K_{r\delta} + G_{r\delta} > 1$ and $\lambda^{t/2} < K_{r\delta} + G_{r\delta}$, one cannot hope to get
a useful bound from Theorem 3.1. If $\lambda \ge 1$, the error bound for the
individual probabilities is then at least of size $\lambda^{-1/2}$, which is the same size as
many of the probabilities themselves. If $\lambda < 1$, the bound on the individual
probabilities is of size comparable to 1.

Remark. Under the extra conditions that $\psi$ is twice differentiable and
that either (3.1) or (3.3) holds with $r \ge 2$, Proposition 2.4 shows that the
factor $\sqrt{\log(\lambda + 1)}$ can in fact be dispensed with. Note that, to satisfy the
conditions of the proposition, it is necessary to take $\chi(\theta) := \exp\{\lambda(e^{i\theta} - 1 - i\theta)\}$,
to get $u'(0) = 0$.

Sometimes it is convenient, for simplicity, to use parameters in the expan- 
sions that are not those emerging naturally from the proofs. The following 
theorem shows that such alterations can easily be allowed for. 

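Theorem 3.3. Suppose that

$\phi_\mu := p_\lambda A; \quad \phi_{\nu^{(1)}} := p_\lambda A'; \quad \phi_{\nu^{(2)}} := p_{\lambda'} A,$

with $A(\theta) := 1 + \sum_{l=1}^{r} a_l(e^{i\theta} - 1)^l$, $A'(\theta) := 1 + \sum_{l=1}^{r} a'_l(e^{i\theta} - 1)^l$, and with $\lambda > \lambda'$. Then,
with $\rho := 2\pi^{-2}\lambda$, $\rho' := 2\pi^{-2}\lambda'$ and $a_0 := 1$,

$d_{\mathrm{loc}}(\mu, \nu^{(1)}) \le \sum_{l=1}^{r} \alpha_{1l}|a_l - a'_l|(\rho \vee 1)^{-(l+1)/2};$

$d_K(\mu, \nu^{(1)}) \le \sum_{l=1}^{r} \alpha_{2l}|a_l - a'_l|(\rho \vee 1)^{-l/2};$

$d_{\mathrm{loc}}(\mu, \nu^{(2)}) \le (\lambda - \lambda')\sum_{l=1}^{r+1} \alpha_{1l}|a_{l-1}|(\rho' \vee 1)^{-(l+1)/2};$

$d_K(\mu, \nu^{(2)}) \le (\lambda - \lambda')\sum_{l=1}^{r+1} \alpha_{2l}|a_{l-1}|(\rho' \vee 1)^{-l/2}.$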

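Proof. For the comparison between $\mu$ and $\nu^{(1)}$, we have

$|A(\theta) - A'(\theta)| \le \sum_{l=1}^{r} |a_l - a'_l|\,|\theta|^l, \quad 0 < |\theta| \le \pi,$

and Proposition 2.2 completes the proof. For that between $\mu$ and $\nu^{(2)}$, note
that $p_\lambda = p_{\lambda-\lambda'}p_{\lambda'}$, and that, for $\lambda > \lambda'$ and $0 < |\theta| \le \pi$,

$|p_{\lambda-\lambda'}(\theta) - 1|\,|A(\theta)| \le (\lambda - \lambda')|\theta|\left\{1 + \sum_{l=1}^{r} |a_l|\,|\theta|^l\right\},$

from which and Proposition 2.2 the remaining results follow. □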



4. Poisson approximation

The measures $\nu_r$ considered above are very explicit. Nevertheless, it is
even neater to have approximation in terms of a Poisson distribution, where
possible. Clearly, if (3.1) holds for any $r = r_0$, $\delta = \delta_0$, then it holds with $r = 0$
and $\psi_0(\theta) = 1$ for all $\theta$, with the exponent $r + \delta$ replaced by $\delta_0$ if $r_0 = 0$ and
by 1 if $r_0 \ge 1$, with $K_0$ depending on $K_{r_0\delta_0}$ and on $a_1, \dots, a_{r_0}$. Theorem 3.1
then gives approximation by Po($\lambda$) with accuracy in Kolmogorov distance
of order $O(\lambda^{-t_0/2})$, for $t_0 = \min\{1, r_0 + \delta_0\}$.
However, if $r_0 \ge 1$, one can also write

$\phi_X(\theta) = \hat\psi(\theta)p_{\lambda'}(\theta)$

for any $\lambda' > 0$, where

$\hat\psi(\theta) := \psi(\theta)\exp\{(\lambda - \lambda')(e^{i\theta} - 1)\}.$

Taking $\lambda' - \lambda = a_1$ now gives a bound

$|\hat\psi(\theta) - 1| \le \gamma|\theta|^{t_1}, \quad |\theta| \le \pi,$

of the form (3.1), with $t_1 = \min\{r_0 + \delta_0, 2\}$. Hence, Theorem 3.1 implies the
following approximation.

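Corollary 4.1. If X has characteristic function $\phi_X(\theta) = \psi(\theta)p_\lambda(\theta)$ such
that (3.1) is satisfied with $r \ge 1$, then we have

1. $d_{\mathrm{loc}}(P_X, \mathrm{Po}(\lambda')) \le \alpha_{1t}\gamma(\rho' \vee 1)^{-(t+1)/2}$;

2. $d_K(P_X, \mathrm{Po}(\lambda')) \le \alpha_{2t}\gamma(\rho' \vee 1)^{-t/2}$,

where $\lambda' = \lambda + a_1$, $t := \min\{2, r + \delta\}$ and $\rho' = 2\pi^{-2}\lambda'$.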

The parameter $\lambda'$ is chosen to make the Poisson mean $\lambda'$ equal to the mean
$\lambda + a_1$ of X. This choice of the Poisson parameter improves the rate, in the
asymptotic sense that, if $a_1, \dots, a_r$ and $K_{r\delta}$ remain bounded but $\lambda \to \infty$,
and $r + \delta \ge 2$, then the approximation error for Kolmogorov distance is of
order $O(\lambda^{-1})$, as opposed to the rate of order $O(\lambda^{-1/2})$ in general obtained
when approximating by Po($\lambda$).

Analogously, fitting the second moment as well (if it is finite) can lead
to further improvement. If Poisson approximation is still the aim, the
easiest way to proceed is to consider translating the random variable X, and
approximating $X - m$ by a Poisson instead: now one would wish to fix

$\lambda' = \mathrm{Var}\,X = EX - m.$

This works well if $\langle \mathrm{Var}\,X - EX\rangle = 0$, where $\langle x\rangle$ denotes the fractional part
of x, but fails otherwise, since $X - m$ only remains integer valued if m is
itself an integer. For general X, we therefore use an average of two adjacent
Poisson probabilities to approximate $P_X\{j\}$. The details are as follows.
Suppose that (3.1) is satisfied with $r \ge 2$: $\phi_X(\theta) = \psi(\theta)p_\lambda(\theta)$ with

$|\psi(\theta) - \psi_r(\theta)| \le K_{r\delta}|\theta|^{r+\delta},$

where $\psi_r(\theta) = \sum_{j=0}^{r} a_j(i\theta)^j$. For $m \in \mathbb{Z}$ and $0 \le p \le 1$, define the
probability measure $Q_{\lambda'mp}$ by

(4.1) $Q_{\lambda'mp}\{j\} := p\,\mathrm{Po}(\lambda')\{j - m - 1\} + (1 - p)\,\mathrm{Po}(\lambda')\{j - m\},$

having characteristic function $q_{\lambda'mp}$ given by

(4.2) $q_{\lambda'mp}(\theta) := e^{im\theta}(1 + p(e^{i\theta} - 1))p_{\lambda'}(\theta).$

For $p = 0$, Q has the distribution of $Z' + m$, where $Z' \sim \mathrm{Po}(\lambda')$; for $p = 1$,
Q has the distribution of $Z' + m + 1$; for $0 < p < 1$, Q is a mixture of these
two distributions. Thus the family of distributions $Q_{\lambda'mp}$ can be interpreted
as a natural generalization of the usual translated Poisson family, in which
the translation is not restricted to the integers, but may take any real value.
Then we can equivalently write $\phi_X(\theta) = \hat\psi(\theta)q_{\lambda'mp}(\theta)$, with

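(4.3) $\hat\psi(\theta) := \psi(\theta)\exp\{(\lambda - \lambda')(e^{i\theta} - 1) - im\theta\}\{1 + p(e^{i\theta} - 1)\}^{-1}.$

In the expansion of $\hat\psi(\theta)$, the coefficients of $\theta$ and $\theta^2$ are equal to zero if

$EX = \lambda + a_1 = m + \lambda' + p$ and $\mathrm{Var}\,X = \lambda + (2a_2 - a_1^2) = \lambda' + p(1 - p).$

Then $m \in \mathbb{Z}$, $0 \le p < 1$ and $\lambda'$ satisfy these two equations if

(4.4) $m := \lfloor a_1 - (2a_2 - a_1^2)\rfloor; \quad p^2 := \langle a_1 - (2a_2 - a_1^2)\rangle;$

$\qquad \lambda' := \lambda + (2a_2 - a_1^2) - p(1 - p),$

and it then follows that

(4.5) $|\hat\psi(\theta) - 1| \le \gamma|\theta|^t,$

for suitable choice of $\gamma$ depending on $a_1, \dots, a_r$ and $K_{r\delta}$, with $t = \min\{3, r + \delta\}$.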
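The moment matching behind the choice of m, p and $\lambda'$ can be illustrated in a few lines. Subtracting the variance equation from the mean equation gives $m + p^2 = a_1 - (2a_2 - a_1^2)$, so m is the integer part and $p^2$ the fractional part of the right-hand side. The sketch below is illustrative only (the function name and the numerical inputs are assumptions) and simply checks that the resulting parameters reproduce the required mean and variance.

```python
import math


def translated_poisson_params(lam, a1, a2):
    """Solve EX = lam + a1 = m + lam' + p and
    Var X = lam + (2*a2 - a1**2) = lam' + p*(1 - p)
    for integer m, 0 <= p < 1 and lam' (hypothetical helper)."""
    v = 2.0 * a2 - a1 ** 2          # Var X - lam
    x = a1 - v                      # equals m + p**2
    m = math.floor(x)
    p = math.sqrt(x - m)            # p**2 is the fractional part of x
    lam_prime = lam + v - p * (1.0 - p)
    return m, p, lam_prime


# Example with made-up coefficients: lam = 50, a1 = 1.3, a2 = 0.4.
m, p, lam_prime = translated_poisson_params(50.0, 1.3, 0.4)
mean_q = m + lam_prime + p               # mean of Q_{lam' m p}
var_q = lam_prime + p * (1.0 - p)        # variance of Q_{lam' m p}
```

Here `mean_q` matches $\lambda + a_1$ and `var_q` matches $\lambda + (2a_2 - a_1^2)$ up to rounding, which is exactly the requirement that the coefficients of $\theta$ and $\theta^2$ vanish.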
Since also, from (4.2) and (3.5),

$|q_{\lambda'mp}(\theta)| \le |p_{\lambda'}(\theta)| \le e^{-\rho'\theta^2},$

with $\rho' = 2\pi^{-2}\lambda'$, the conditions of Proposition 2.1 are satisfied with $\chi =
q_{\lambda'mp}$, yielding the following corollary.

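Corollary 4.2. If X has characteristic function $\phi_X(\theta) = \psi(\theta)p_\lambda(\theta)$ such
that (3.1) is satisfied with $r \ge 2$, then, for $\lambda'$, m and p defined as in (4.4),
and for $t := \min\{3, r + \delta\}$, we have translated Poisson approximation of the
form

1. $d_{\mathrm{loc}}(P_X, Q_{\lambda'mp}) \le \alpha_{1t}\gamma(\rho' \vee 1)^{-(t+1)/2}$;

2. $d_K(P_X, Q_{\lambda'mp}) \le \alpha_{2t}\gamma(\rho' \vee 1)^{-t/2}$,

where $\rho' = 2\pi^{-2}\lambda'$ and $\gamma$ is as in (4.5). If (3.1) is replaced by (3.3), then
one takes $a_1 := \tilde a_1$ and $a_2 := \tilde a_2 + \frac{1}{2}\tilde a_1$ in (4.4) to determine $\lambda'$, m and p.

In particular, if $a_1, \dots, a_r$ and $K_{r\delta}$ remain bounded but $\lambda \to \infty$, and if
$r + \delta \ge 3$, then $t = 3$ and the order of approximation in Kolmogorov distance
is of order $O(\lambda^{-3/2})$.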

5. More general expansions 

We now consider cases in which the role of the Poisson family Po($\lambda$) is
replaced by that of another family of probability distributions $R_\lambda$, $\lambda \ge 1$, on
the integers. We shall assume that, for $Z_\lambda \sim R_\lambda$, $\mu(\lambda) := EZ_\lambda$ and $\sigma^2(\lambda) :=
\mathrm{Var}\,Z_\lambda$ exist, and are both continuous functions of $\lambda$, with $\sigma^2(\lambda)$ increasing
to infinity with $\lambda$. Suppose also that there exist $c > 0$ and $h(\lambda)$ such that

(5.1) $|r_\lambda(\theta)| \le \exp\{-c\,h(\lambda)\theta^2\}, \quad |\theta| \le \pi,$

where $r_\lambda$ is the characteristic function of $R_\lambda$. Clearly, if (5.1) is satisfied,
one could take

$h(\lambda) := \inf_{0<|\theta|\le\pi}\{-\theta^{-2}\log|r_\lambda(\theta)|\}$

and $c = 1$, or else maybe $h(\lambda) := \sigma^2(\lambda)$ with c to be determined, but it may
also be more convenient to choose some other, simpler form. Then, much
as in Section 3, we can consider approximating the distribution of a random
variable X with characteristic function $\phi_X := \psi(\theta)r_\lambda(\theta)$ by that of a signed
measure $\nu_r = \nu_r(R_\lambda; \tilde a_1, \dots, \tilde a_r)$ with characteristic function

$\phi_{\nu_r}(\theta) := \tilde\psi_r(\theta)r_\lambda(\theta) := \sum_{l=0}^{r} \tilde a_l(e^{i\theta} - 1)^l r_\lambda(\theta),$

(as usual, $\tilde a_0 = 1$). As in the Poisson case, $\nu_r$ is just the linear
combination $\sum_{l=0}^{r} \tilde a_l D^l R_\lambda$ of the differences $D^l R_\lambda$ of the probability measure $R_\lambda$.
Approximation of the characteristic functions could be expressed either as

(5.2) $|\psi(\theta) - \tilde\psi_r(\theta)| \le K_{r\delta}|\theta|^{r+\delta}, \quad |\theta| \le \pi,$

for real coefficients $\tilde a_l$ and for $r \in \mathbb{N}_0$, $0 < \delta \le 1$, or as

(5.3) $|\psi(\theta) - \psi_r(\theta)| \le K_{r\delta}|\theta|^{r+\delta}, \quad |\theta| \le \pi,$

where $\psi_r(\theta)$ is as in (3.2), in which case the corresponding coefficients $\tilde a_l$ can
be deduced from (3.12). These considerations lead to the following theorem,
following directly from Proposition 2.1.

Theorem 5.1. Let X be a random variable on $\mathbb{Z}$ with distribution $P_X$.
Suppose that its characteristic function $\phi_X$ is of the form $\psi r_\lambda$, where $R_\lambda$ is
as above. Suppose also that (5.2) is satisfied, for some $r \in \mathbb{N}_0$ and $\delta > 0$.
Then, writing $t = r + \delta$, we have

1. $d_{\mathrm{loc}}(P_X, \nu_r) \le \alpha_{1t}K_{r\delta}(\rho \vee 1)^{-(t+1)/2}$;

2. $d_K(P_X, \nu_r) \le \alpha_{2t}K_{r\delta}(\rho \vee 1)^{-t/2}$,

with $\rho := c\,h(\lambda)$, $\alpha_{1t}$ and $\alpha_{2t}$ as in Proposition 2.1, and

$\nu_r = \nu_r(R_\lambda; \tilde a_1, \dots, \tilde a_r).$

If (5.2) is replaced by (5.3), the corresponding bounds hold with $K_{r\delta}$ replaced
by $K_{r\delta} + G_{r\delta}$, with $G_{r\delta} := \Gamma_r\pi^{1-\delta}$ and $\Gamma_r$ as in (3.13).

As in Section 4, one may prefer to approximate with a suitably translated
member of the family $\{R_\lambda, \lambda \ge 1\}$, rather than with a signed measure $\nu_r$.
The corresponding family of distributions $Q_{mp}(R_\lambda)$, for $m \in \mathbb{Z}$ and $0 \le p \le
1$, is given by

(5.4) $Q_{mp}(R_\lambda)\{j\} := pR_\lambda\{j - m - 1\} + (1 - p)R_\lambda\{j - m\},$

having characteristic function $q^{(\lambda)}_{mp}$ given by

(5.5) $q^{(\lambda)}_{mp}(\theta) := e^{im\theta}(1 + p(e^{i\theta} - 1))r_\lambda(\theta).$

Once again, the trick is to find $\lambda'$, m and p so that the mean and variance
of X and of the distribution $Q_{mp}(R_{\lambda'})$ are matched.

If (5.3) is satisfied with $r \ge 2$, matching mean and variance implies that
we need

(5.6) $EX = \mu(\lambda) + a_1 = m + \mu(\lambda') + p;$
$\qquad \mathrm{Var}\,X = \sigma^2(\lambda) + (2a_2 - a_1^2) = \sigma^2(\lambda') + p(1 - p),$

where the coefficients $a_1$ and $a_2$ are as in (3.2). These equations have a
solution, as long as $\mathrm{Var}\,X \ge \sigma^2(1) + 1/4$, obtained as follows. For $0 \le p \le 1$,
let $\lambda(p)$ be defined to be the solution of the equation $\sigma^2(\lambda(p)) = \mathrm{Var}\,X -
p(1 - p)$, noting that $\lambda(0) = \lambda(1)$. Choose $m^* := \lfloor EX - \mu(\lambda(0))\rfloor$. Then the
continuous function

$f(p) := EX - \mu(\lambda(p)) - m^* - p$

satisfies $f(0) \ge 0 > f(1)$, so that there exists a $p^*$ such that $f(p^*) = 0$.
Then the choice $\lambda' = \lambda(p^*)$, $m^*$ and $p^*$ satisfies (5.6), as desired.

Corollary 5.2. If X has characteristic function $\phi_X(\theta) = \psi(\theta)r_\lambda(\theta)$ and
if (5.3) is satisfied with $r \ge 2$, then, for $\lambda'$, m and p solving (5.6) and for
$t := \min\{3, r + \delta\}$, we have translated $R_\lambda$-approximation of the form

1. $d_{\mathrm{loc}}(P_X, Q_{mp}) \le \alpha_{1t}\gamma(\rho' \vee 1)^{-(t+1)/2}$;

2. $d_K(P_X, Q_{mp}) \le \alpha_{2t}\gamma(\rho' \vee 1)^{-t/2}$,

where $\rho' = c\,h(\lambda')$ and for suitable choice of $\gamma$.

The most natural application of the above theorem is to mod-compound
Poisson approximation. For $\lambda > 0$ and for $\mu$ a probability distribution on $\mathbb{Z}$,
let CP($\lambda, \mu$) denote the distribution of the sum $Y := \sum_{j\in\mathbb{Z}\setminus\{0\}} jZ_j$, where
$Z_j$, $j \ne 0$, are independent, and $Z_j \sim \mathrm{Po}(\lambda\mu_j)$. Then, if $\mu_1 > 0$, the
characteristic function of Y is of the form $r_\lambda := \zeta_\lambda p_{\lambda_1}$, where $\zeta_\lambda$ is the
characteristic function of $\sum_{j\in\mathbb{Z}\setminus\{0,1\}} jZ_j$ and $\lambda_1 = \lambda\mu_1$. Thus, for the purposes
of applying Theorem 5.1 and Corollary 5.2, $\rho$ can be taken to be $2\pi^{-2}\lambda_1$.
Corollary 5.2, for instance, then gives conditions under which translated
compound Poisson approximation can be achieved, at rate $O(\lambda^{-3/2})$.

These considerations apply as long as $\mu_1 > 0$, and could also be invoked
if $\mu_{-1} > 0$. If $\mu_1 = \mu_{-1} = 0$, there is then no factor of the form $p_\lambda$
to guarantee that, for some $\rho > 0$, the characteristic function $\phi_Y$ of Y
(corresponding to the characteristic function $\chi$ of Proposition 2.1) satisfies
$|\phi_Y(\theta)| \le \exp\{-\rho\theta^2\}$ for all $|\theta| \le \pi$. Some additional aperiodicity condition
needs to be satisfied, if the family $\{\mathrm{CP}(\lambda, \mu), \lambda \ge 1\}$ is to satisfy (5.1).
Indeed, if $Y = 2Z$ where $Z \sim \mathrm{Po}(\lambda)$, and if $W \sim \mathrm{Be}(1/2)$ is independent
of Y, it is not true that the distribution of $Y + W$ is close to that of Y in
total variation, even though $|\phi_{Y+W}(\theta) - \phi_Y(\theta)| \le \frac{1}{2}|\theta|$.
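The failure in this example is easy to see numerically: Y lives on the even integers, while Y + W puts half of its mass on the odd integers, so the total variation norm of the difference stays equal to 1 however large $\lambda$ is. The following sketch, with illustrative function names and truncation level, computes $\sum_j |P[Y + W = j] - P[Y = j]|$ directly.

```python
import math


def poisson_pmf(lam, j):
    """Po(lam){j}, with the convention that the mass is 0 for j < 0."""
    if j < 0:
        return 0.0
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))


def tv_norm_doubled_poisson_plus_bernoulli(lam, j_max=400):
    """sum_j |P[Y + W = j] - P[Y = j]| for Y = 2Z, Z ~ Po(lam),
    W ~ Be(1/2) independent of Y, truncated at j_max (hypothetical helper)."""

    def p_y(j):
        # Y = 2Z puts mass only on non-negative even integers.
        return poisson_pmf(lam, j // 2) if j >= 0 and j % 2 == 0 else 0.0

    total = 0.0
    for j in range(j_max + 1):
        p_yw = 0.5 * p_y(j) + 0.5 * p_y(j - 1)
        total += abs(p_yw - p_y(j))
    return total
```

For any $\lambda$ whose Poisson mass beyond `j_max / 2` is negligible, the function returns a value indistinguishable from 1, in contrast with the $O(\lambda^{-1/2})$ that a bound of the form $|\psi_\mu - \psi_\nu| \le \gamma|\theta|$ would give if the common factor satisfied the Gaussian-type bound of Proposition 2.1.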

6. Applications 

6.1. A single convolution. The most obvious application of the above
results arises when $\phi_X = \psi p_\lambda$ and $\psi$ is itself the characteristic function of a
probability distribution on the integers. In this case, X is the sum of two
independent random variables, one of them with the Po($\lambda$) distribution, and
the situation is probabilistically very simple. For example, we could take $\psi$
to be the characteristic function of a random variable $Y_s$ with

$P[Y_s = j] = s!\,s\,\frac{(j-1)!}{(j+s)!}, \quad j \ge 1,$

for some integer $s \ge 1$. Calculation shows that $Y_s$ has characteristic function

s—l —i0\l 

m = l + s^ ^ ^ _,(i_e--^)Mog(l-e^^), 

1=1 

and that (3.3) holds with $r = s - 1$ and any $\delta < 1$ if

Hence, if $X = Z + Y_s$, where $Z \sim \mathrm{Po}(\lambda)$ is independent of $Y_s$, then the
theorems in Sections 3 and 4 can be applied, provided that s is large enough; in
particular, a translated Poisson approximation can be applied with accuracy
of order $O(\lambda^{-3/2+\varepsilon})$ for any $\varepsilon > 0$ if $s = 3$ (in which case X has finite second
moment), and of order $O(\lambda^{-3/2})$ if $s \ge 4$. Similar considerations apply to
the approximation of $X = Z - Y_s$.

6.2. Sums of independent random variables. Let $X_1, \ldots, X_n$ be
independent integer valued random variables, and let $S_n$ denote their sum. In
contexts in which a central limit approximation to the distribution of $S_n$
would be appropriate, the classical Edgeworth expansion (see, e.g., Petrov
1975, Chapter 5) is unwieldy, because $S_n$ is confined to the integers. As
an alternative, Barbour and Čekanavičius (2002) give a Poisson-Charlier
expansion, for $S_n$ 'centered' so that its mean and variance are almost equal,
with an error bound expressed in the total variation norm. Here, we show
that such an expansion can be justified by the techniques of this paper.

Assume that each of the $X_j$ has finite $(r+1+\delta)$'th moment, with $r \ge 1$,
and define

(6.1)   $A^{(n)}(w) := 1 + \sum_{l \ge 2} a_l^{(n)} w^l = \exp\Bigl\{\sum_{l=2}^{r+1} \kappa_l^{(n)} w^l / l!\Bigr\},$

where $\kappa_l^{(n)} := \kappa_l(S_n)$ and $\kappa_l(X)$ denotes the $l$'th factorial cumulant of the
random variable $X$. Then the approximation that we establish is to the
Poisson-Charlier signed measure $\nu_r$ with

(6.2)   $\nu_r\{j\} := \mathrm{Po}(\lambda)\{j\}\Bigl\{1 + \sum_{l=2}^{L_r} (-1)^l a_l^{(n)} C_l(j;\lambda)\Bigr\},$

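where $L_r := \max\{1, 3(r-1)\}$, and where $\lambda := \mathbb{E}S_n$; $\nu_r$ has characteristic
function

(6.3)   $\hat\nu_r(\theta) := p_\lambda(\theta) A_r^{(n)}(\theta),$

where

(6.4)   $A_r^{(n)}(\theta) := 1 + \sum_{l=2}^{L_r} a_l^{(n)} (e^{i\theta} - 1)^l.$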

We need two further quantities involving the $X_j$:

(6.5)   $K^{(n)} := \Bigl|\sum_{j=1}^{n} \kappa_2(X_j)\Bigr|$

and

(6.6)   $p_j := 1 - d_{TV}(\mathcal{L}(X_j), \mathcal{L}(X_j + 1)).$
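
The overlap quantity in (6.6) is easy to compute for a distribution with finite support. The sketch below is illustrative only; the dictionary representation and the function name are assumptions made for the example. It evaluates $1 - d_{TV}(\mathcal{L}(X), \mathcal{L}(X+1))$.

```python
def overlap_with_unit_translate(pmf):
    """Return 1 - d_TV(L(X), L(X + 1)) for integer-valued X.

    pmf maps integer values to probabilities. Since P[X + 1 = k] = P[X = k - 1],
    the total variation distance is half the sum of |P[X = k] - P[X = k - 1]|.
    """
    support = set(pmf) | {k + 1 for k in pmf}
    tv = 0.5 * sum(abs(pmf.get(k, 0.0) - pmf.get(k - 1, 0.0)) for k in support)
    return 1.0 - tv
```

A Bernoulli(1/2) variable has overlap 1/2, whereas a variable supported on the even integers has overlap 0, which is the periodic situation that a positive lower bound on the $p_j$ excludes.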

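Theorem 6.1. Suppose that there are constants $K_l$, $1 \le l \le r+1$, such
that, for each $j$,

    $|\kappa_l(X_j)| \le K_l, \quad 2 \le l \le r+1; \qquad \mathbb{E}|X_j|^{r+1+\delta} \le K_1^{r+1+\delta}.$

Suppose also that $p_j \ge p_0 > 0$ for all $j$, and that $\lambda \ge n\lambda_0$. Then

    $d_K(\mathcal{L}(S_n), \nu_r) \le G(K_1, \ldots, K_{r+1}, K^{(n)}, p_0^{-1}, \lambda_0^{-1})\, n^{-(r-1+\delta)/2},$

for a function $G$ that is bounded on compact sets.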

Remark. For asymptotics in $n$, with triangular arrays of variables, the
error is of order $O(n^{-(r-1+\delta)/2})$ when $\lambda_0$ and $p_0$ are bounded away from
zero, and $K_1, \ldots, K_{r+1}$ and $K^{(n)}$ remain bounded. The requirements on
$\lambda_0$ and $p_0$ can often be achieved by grouping the random variables
appropriately, though attention then has to be paid to the consequent changes
in the $K_l$. The final condition can always be satisfied with $K^{(n)} \le 1$, by
replacing the $X_j$ by translates, where necessary. For more discussion, we
refer to Barbour and Čekanavičius (2002). The above conditions are
designed to cover sums of independent random variables, each of which has
non-trivial variance, has uniformly bounded $(r+1+\delta)$'th moment, and
whose distribution overlaps with its unit translate.

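Proof. We check the conditions of Proposition 2.2. First, in view of (6.6),
we can write

    $\mathbb{E}(e^{i\theta X_j}) = \tfrac12 p_j (e^{i\theta} + 1)\phi_{1j}(\theta) + (1 - p_j)\phi_{2j}(\theta),$

where both $\phi_{1j}$ and $\phi_{2j}$ are characteristic functions. Hence we have

    $|\mathbb{E}(e^{i\theta X_j})| \le 1 - p_j + p_j\cos(\theta/2) \le 1 - p_j\theta^2/4\pi, \qquad 0 \le |\theta| \le \pi.$

Hence $\phi_n(\theta) := \mathbb{E}(e^{i\theta S_n})$ satisfies

(6.7)   $|\phi_n(\theta)| \le \exp\{-n p_0 \theta^2/4\pi\}, \qquad 0 \le |\theta| \le \pi.$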
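
The elementary inequality behind (6.7), namely $1 - p + p\cos(\theta/2) \le 1 - p\theta^2/4\pi \le \exp\{-p\theta^2/4\pi\}$ for $|\theta| \le \pi$, can be checked on a grid. This is a numerical illustration only, with an arbitrary grid size and a hypothetical function name, and it does not replace the analytic argument.

```python
import math

def max_violation(p, n_grid=2001):
    """Largest violation, over a grid of theta in [-pi, pi], of the chain
    1 - p + p*cos(theta/2) <= 1 - p*theta^2/(4*pi) <= exp(-p*theta^2/(4*pi)).
    A value <= 0 (up to rounding) means the chain holds on the grid."""
    worst = -math.inf
    for i in range(n_grid):
        theta = -math.pi + 2 * math.pi * i / (n_grid - 1)
        first = 1 - p + p * math.cos(theta / 2)
        second = 1 - p * theta ** 2 / (4 * math.pi)
        third = math.exp(-p * theta ** 2 / (4 * math.pi))
        worst = max(worst, first - second, second - third)
    return worst
```

Taking the product over $j$ of the single-variable bounds, with $p_j \ge p_0$, then gives the exponent $-n p_0 \theta^2/4\pi$.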

On the other hand, from the additivity of the factorial cumulants, we have

    $|\kappa_l(S_n)| \le nK_l, \qquad 3 \le l \le r+1,$

with $|\kappa_2(S_n)| \le K^{(n)}$, from (6.5). From (6.1), we thus deduce the bound
$|a_l^{(n)}| \le c_l n^{\lfloor l/3 \rfloor}$, for $c_l = c_l(K^{(n)}, K_3, \ldots, K_{r+1})$, $l \ge 1$. Hence

(6.8)   $|\hat\nu_r(\theta)| \le \exp\{-2n\lambda_0\theta^2/\pi^2\}\, c' n^{\lfloor L_r/3 \rfloor} \le \exp\{-n\lambda_0\theta^2/\pi^2\}\, c'',$

for $c'' = c''(K^{(n)}, K_3, \ldots, K_{r+1})$, and we can take $\eta := Ce^{-n\rho'\theta_0^2}$ in
Proposition 2.2, for

    $\rho' = \min\{\lambda_0/\pi^2,\ p_0/4\pi\}$

and a suitable $C = C(K^{(n)}, K_3, \ldots, K_{r+1})$. The choice of $\theta_0$ we postpone
for now.

For $|\theta| \le \theta_0$, we take $\chi(\theta) := p_\lambda(\theta)$, and check the approximation of

    $\phi_n(\theta)\exp\{-\lambda(e^{i\theta} - 1)\} = \mathbb{E}\{(1+w)^{S_n}\}\, e^{-\lambda w}$

by $A_r^{(n)}(\theta)$, a polynomial in $w := e^{i\theta} - 1$. We begin with the inequality



r+1 I 
W 



1=0 



^ \sir+2)\ , ,^+2 A 2 



(r + 2)! 



(r + 1)! 



w 



r+1 



(r + 2)! 

derived using Taylor's expansion, true for any $s \in \mathbb{Z}$ and $0 < \delta \le 1$, where
$s_{(l)} := s(s-1)\cdots(s-l+1)$. Hence, for each $j$, we have

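(6.9)   $\Bigl|\mathbb{E}\{(1+w)^{X_j}\} - \sum_{l=0}^{r+1} \frac{\mathbb{E}\{(X_j)_{(l)}\}}{l!}\, w^l\Bigr| \le C_{r,\delta}\, \mathbb{E}|X_j|^{r+1+\delta}\, |w|^{r+1+\delta},$

for a universal constant $C_{r,\delta}$. Then, writing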

    $Q_{r+1}(w; X) := \exp\Bigl\{\sum_{l=2}^{r+1} \kappa_l(X)\, w^l / l!\Bigr\},$

and using the differentiation formula in Petrov (1975, p. 170), we have

r+1 /TA \ \n\r+1 



qSi(^;^.)-E^'ie(^^) 



m|r+2 

< — - sup 



QSi(^;^i; 



2=e'»'-l 



r + 2)! |0'|<0o 

(6.10) <\e\-+^ciKi,...,Kr+i), 

for a suitable function $c$ and for all $|\theta| \le \pi$. Combining these estimates, we
deduce that, for $w = e^{i\theta} - 1$ and for all $|\theta| \le \pi$,

(6.11)  $\bigl|\mathbb{E}\{(1+w)^{X_j}\}\, e^{-w\mathbb{E}X_j} - Q_{r+1}(w; X_j)\bigr| \le k_1 |\theta|^{r+1+\delta},$

where $k_1 = k_1(K_1, \ldots, K_{r+1})$.

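Now a standard inequality shows that, for $u_j := \prod_{l=1}^{j} x_l \prod_{l=j+1}^{n} y_l$ with
complex $x_l, y_l$ such that $y_l \ne 0$ and $|x_l/y_l - 1| \le \varepsilon_l$, then

(6.12)  $|u_n - u_0| \le |u_0| \Bigl\{\prod_{s=1}^{n-1} (1 + \varepsilon_s)\Bigr\} \sum_{l=1}^{n} \varepsilon_l.$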
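
The product inequality can be illustrated numerically. The sketch below is a check on sample inputs rather than a proof, and the function name and the floating point tolerance are illustrative. For given complex factors it takes $\varepsilon_l = |x_l/y_l - 1|$ and tests whether $|\prod_l x_l - \prod_l y_l| \le |\prod_l y_l| \prod_{s=1}^{n-1}(1+\varepsilon_s) \sum_{l=1}^{n} \varepsilon_l$.

```python
import math

def product_gap_bound_holds(xs, ys):
    """Check |prod(xs) - prod(ys)| <= |prod(ys)| * prod_{s<n}(1 + eps_s) * sum(eps),
    with eps_l = |x_l / y_l - 1| and all y_l nonzero."""
    eps = [abs(x / y - 1) for x, y in zip(xs, ys)]
    u_n = 1 + 0j
    for x in xs:
        u_n *= x
    u_0 = 1 + 0j
    for y in ys:
        u_0 *= y
    bound = abs(u_0) * math.prod(1 + e for e in eps[:-1]) * sum(eps)
    # small relative and absolute slack for floating point rounding
    return abs(u_n - u_0) <= bound * (1 + 1e-12) + 1e-15
```

The bound follows from $|\prod_l (1+\delta_l) - 1| \le \prod_l (1+\varepsilon_l) - 1$ for $|\delta_l| \le \varepsilon_l$, followed by a telescoping expansion of the right hand side.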



Taking $x_j := \mathbb{E}\{(1+w)^{X_j}\}\, e^{-w\mathbb{E}X_j}$ and $y_j := Q_{r+1}(w; X_j)$, (6.11) shows
that we can take $\varepsilon_l := \varepsilon := k_1 |\theta|^{r+1+\delta} M$ for each $l$, with

    $M := \exp\Bigl\{\sum_{l=2}^{r+1} K_l / l!\Bigr\},$

provided that $|\theta| \le \theta_0 \le 1$. Choosing $\theta_0 := n^{-1/3}$ then ensures that $(1+\varepsilon)^n$
is suitably bounded, and (6.12) yields



(6.13) 



for $k_2 = k_2(K^{(n)}, K_1, \ldots, K_{r+1})$, since

    $|Q_{r+1}(w; S_n)| \le \exp\{|\kappa_2(S_n)|\theta_0^2/2\}\, \exp\Bigl\{\sum_{l=3}^{r+1} nK_l |\theta_0|^l / l!\Bigr\}$

is bounded for $\theta_0 = n^{-1/3}$, in view of (6.5).
The remaining step is to note that, for $w = e^{i\theta} - 1$,

(6.14)



miLr+l 

[Lr + Ij! \e'\<eo 



l{z\ Sn) 



where the right hand side is at most k^'nJ'~^\9\^^^^ (\ + n\6\'^) in |^| < n"-*^/^, 
with $k_3 = k_3(K^{(n)}, K_1, \ldots, K_{r+1})$. Here, we use the facts that $|\kappa_2(S_n)|$ is
bounded by $K^{(n)}$, and that each $\kappa_l(S_n)$ for $l \ge 3$, for which we have only
the weak bound $nK_l$, occurs associated with the power $w^l$ in the exponent
of $Q_{r+1}(w; S_n)$. Combining this with (6.13), we have established that for
$|\theta| \le n^{-1/3}$, we have

(6.15)  $|\phi_n(\theta)\exp\{-\lambda(e^{i\theta} - 1)\} - A_r^{(n)}(\theta)| \le k_4\, n|\theta|^{r+1+\delta}\bigl(1 + \{n|\theta|^2\}^{r-1}\bigr),$

where $k_4 = k_4(K^{(n)}, K_1, \ldots, K_{r+1})$. This gives
71 = n/c4, h 



r + 1 + 6, 72 = n'^ki, t2 



7 =, p = 2A/7r^, e = 0, and 

7,1/3 



n 



3r - 1 + 5 

-1/3 



in Proposition 2.2, together with $\eta = Ce^{-n\rho'\theta_0^2}$ from the earlier bounds.
Applying Corollary 2.3, using the tail properties of the Poisson-Charlier
measures p. lip , the theorem follows. □ 



A total variation bound of precisely the same order can also be deduced, by
combining the arguments used for Propositions 2.2 and 2.4. Note that $\phi_n$
is twice differentiable, because the $X_j$ all have finite second moments, and
that, as in Section 3, we need to take $\chi(\theta) := \exp\{e^{i\theta} - 1 - i\theta\}$.

6.3. Analytic combinatorial schemes. An extremely interesting range of
applications is to be found in the paper of Hwang (1999). His conditions are
motivated by examples from combinatorics, in which generating functions
are natural tools. He works in an asymptotic setting, assuming that $X_n$ is
a random variable whose probability generating function $R_n$ is of the form

where $h$ is a non-negative integer, and both $g$ and $\varepsilon_n$ are analytic in a
closed disc of radius $\eta > 1$. As $n \to \infty$, he assumes that $\lambda \to \infty$ and that
$\sup_{z:|z| \le \eta} |\varepsilon_n(z)| \le K\lambda^{-1}$, uniformly in $n$. He then proves a number of results
describing the accuracy of the approximation of $P_{X_n - h}$ by $\mathrm{Po}(\lambda + g'(1))$.
Under his conditions, it is immediate that we can write

(6.16)  $g(z) = \sum_{j \ge 0} g_j (z-1)^j$  and  $\varepsilon_n(z) = \sum_{j \ge 0} \varepsilon_{nj} (z-1)^j$

for $|z - 1| < \eta - 1$, with

(6.17)  $|g_j| \le k_g (\eta - 1)^{-j}$  and  $|\varepsilon_{nj}| \le \lambda^{-1} k_\varepsilon (\eta - 1)^{-j}$

for all $j \ge 0$. Hence $X := X_n - h$ has characteristic function of the form
$\psi_n p_\lambda$, where

and hence, for any $r \in \mathbb{N}_0$,

(6.18)  $|\psi_n(\theta) - \tilde\psi_r(\theta)| \le K_{r\eta} |\theta|^{r+1}, \qquad |\theta| \le (\eta - 1)/2,$

with $\tilde\psi_r$ defined as in (3.4), taking $a_j^{(n)} = g_j + \varepsilon_{nj}$; note that the constant $K_{r\eta}$
can indeed be taken to be uniform for all $n$. Since also $g$ and $\varepsilon_n$ are both
uniformly bounded on the unit circle, and since $\psi_n$ is bounded (uniformly
in $n$) for $|\theta| \le \pi$, it is clear that (6.18) can be extended to all $|\theta| \le \pi$, albeit
with a different uniform constant $K'_{r\eta}$, so that (3.3) holds with $\delta = 1$ for any
$r \in \mathbb{N}_0$. Thus Theorems 3.1 and 3.2 can be applied with any choice of $r$,
giving progressively more accurate approximations to $P_{X_n - h}$, as far as the
$\lambda$-order is concerned, in terms of progressively more complicated
perturbations of the Poisson distribution. These theorems are thus applicable to all
the examples that Hwang considers, including the numbers of components
(counted in various ways) in a wide class of logarithmic assemblies, multisets
and selections.

For instance, Corollary 4.2 gives an approximation to $P_{X_n - h}$ by the
mixture $Q_{\lambda' m p}$ with

    $m := \lfloor m_n - v_n \rfloor; \qquad p^2 := m_n - v_n - m; \qquad \lambda' := \lambda + v_n - p(1-p),$

where $m_n := g_n'(1)$, $v_n := g_n''(1) + g_n'(1) - \{g_n'(1)\}^2$ and $g_n := g + \varepsilon_n$.
Hwang's approximation by $\mathrm{Po}(\lambda + g'(1))$ has asymptotically the same mean
as ours (and as that of $X_n - h$), but a variance asymptotically differing by
$\kappa := g''(1) - \{g'(1)\}^2$ (together with an element arising from $p(1-p)$ which
is not in general asymptotically negligible). As a consequence, Hwang's
approximation has an error of larger asymptotic order, in which the quantity $\kappa$
appears; for instance, for Kolmogorov distance, his Theorem 1 gives an error
of order $O(\lambda^{-1})$, whereas that from Corollary 4.2 is of order $O(\lambda^{-3/2})$.

Although our Poisson expansion theorems are automatically applicable
under Hwang's conditions, they also apply to examples that do not satisfy his
conditions: that of Section 6.1 is one such. Conversely, Hwang's Theorem 2,
which establishes Poisson approximation in the lower tail with good relative
accuracy, cannot be proved using only our conditions; the conclusion would
not be true, for instance, for the random variable $X = Z - Y_s$ of Section 6.1.

Note also that Hwang examines problems from combinatorial settings 
in which approximation is not by Poisson distributions: he has examples 
concerning the Bessel family, 

B{X){j} := m-'j^^_, jeN, 

for the appropriate choice of $L(\lambda)$. Here, we could apply Corollary 5.2 to
obtain slightly sharper approximations than his within the translated Bessel
family, or Theorem 5.1 to obtain asymptotically more accurate expansions.

6.4. Prime divisors. The numbers of prime divisors of a positive
integer $n$, counted either with ($\Omega(n)$) or without ($\omega(n)$) multiplicity, can also
be treated by these methods, since excellent information is available about
their generating functions. For our purposes, we use only the shortest
expansion, taken from Tenenbaum (1995, Theorems II.6.1 and 6.2). One finds
that for $N_n$ uniformly distributed on $\{1, 2, \ldots, n\}$ we have

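    $\mathbb{E}\{e^{i\theta\omega(N_n)}\} = p_{\log\log n}(\theta)\,\{\Phi_1(e^{i\theta} - 1) + \eta_1(\theta)\},$
    $\mathbb{E}\{e^{i\theta\Omega(N_n)}\} = p_{\log\log n}(\theta)\,\{\Phi_2(e^{i\theta} - 1) + \eta_2(\theta)\},$

where $|\eta_s(\theta)| \le C_s/\log n$, $s = 1, 2$, for some constants $C_1$ and $C_2$, and

    $\Phi_1(w) := \frac{1}{\Gamma(1+w)} \prod_q \Bigl(1 + \frac{w}{q}\Bigr)\Bigl(1 - \frac1q\Bigr)^{w}, \qquad \Phi_2(w) := \frac{1}{\Gamma(1+w)} \prod_q \Bigl(1 - \frac{w}{q-1}\Bigr)^{-1}\Bigl(1 - \frac1q\Bigr)^{w},$

$q$ running here over prime numbers. These expansions were established and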
used by Rényi and Turán (1958) in their proof of the Erdős-Kac Theorem,
but they are also sketched by Selberg (1954). We refer to Kowalski and
Nikeghbali (2009) for the structural interpretation of the two factors in these
functions (with $1/\Gamma(1+w)$ being related to the number of cycles of large
random permutations).

Let $a_{ls}$, $s = 1, 2$, denote the Taylor coefficients of the functions $\Phi_s(w)$
as power series in $w$ (around $w = 0$, which corresponds to $\theta = 0$). By
analyticity, it follows that for any $r$, we have

    $\Bigl|\Phi_s(w) - 1 - \sum_{l=1}^{r} a_{ls} w^l\Bigr| \le C_{rs} |w|^{r+1}$

for suitable constants $C_{rs}$ and for $|w| \le 2$. Defining the measures $\nu_r^{(s)}$ by

    $\nu_r^{(s)}\{j\} := \mathrm{Po}(\log\log n)\{j\}\Bigl(1 + \sum_{l=1}^{r} (-1)^l a_{ls}\, C_l(j; \log\log n)\Bigr),$

this leads to the following conclusion, which is deduced immediately from
Theorem 3.1 and refines the Erdős-Kac theorem.



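Theorem 6.2. For the measures $\nu_r^{(s)}$ defined above, we have

    $d_{loc}(P_{\omega(N_n)}, \nu_r^{(1)}) \le \alpha_{1,r+1} C_{r1} (\log\log n)^{-1-r/2} + \alpha_1 C_1/\log n;$
    $d_K(P_{\omega(N_n)}, \nu_r^{(1)}) \le \alpha_{2,r+1} C_{r1} (\log\log n)^{-(r+1)/2} + C_1 \log\log n/\log n;$
    $d_{loc}(P_{\Omega(N_n)}, \nu_r^{(2)}) \le \alpha_{1,r+1} C_{r2} (\log\log n)^{-1-r/2} + \alpha_1 C_2/\log n;$
    $d_K(P_{\Omega(N_n)}, \nu_r^{(2)}) \le \alpha_{2,r+1} C_{r2} (\log\log n)^{-(r+1)/2} + C_2 \log\log n/\log n,$

for suitable constants $C_1$ and $C_2$.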



Remark. Note that it follows from Theorem 3.2 that the total variation
distance is in each case also of order $O\{(\log\log n)^{-(r+1)/2}\}$. This can be
deduced by applying the theorem to the expansion with one more term,
and then observing that the extra term has total variation norm of order
$O\{(\log\log n)^{-(r+1)/2}\}$, in view of the observation following (3.9).
Alternatively, one could use Proposition 2.4. As far as we know, total variation
approximation was first considered in this context by Harper (2009), who
proved a bound with error of size $1/(\log\log n)$ (for a truncated version of




$\omega(n)$, counting only prime divisors of size up to $n^{1/(3(\log\log n)^2)}$), and deduced
explicit bounds in Kolmogorov distance.

To indicate what this means in concrete terms for number theory readers,
consider the case of $\omega(n)$ for $r = 1$. Taylor expansion gives

    $\Phi_1(w) = 1 + B_1 w + O(w^2)$

as $w \to 0$, where $B_1 \approx 0.26149721$ is the Mertens constant, i.e., the real
number such that

    $\sum_{q \le x,\ q\ \mathrm{prime}} \frac{1}{q} = \log\log x + B_1 + o(1),$

as $x \to +\infty$.
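
The defining limit of the Mertens constant can be observed numerically. The sketch below uses a basic sieve of Eratosthenes, with helper names that are illustrative choices, to evaluate $\sum_{q \le x} 1/q - \log\log x$ for a moderate $x$ and compare it with the value $0.26149721$ quoted above.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # strike out the multiples p*p, p*p + p, ... of p
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [k for k in range(n + 1) if sieve[k]]

def mertens_estimate(x):
    """Finite-x approximation to B_1: sum of 1/q over primes q <= x, minus log log x."""
    return sum(1.0 / q for q in primes_up_to(x)) - math.log(math.log(x))
```

Already at $x = 10^6$ the estimate agrees with $B_1$ to about four decimal places.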

In view of the remark above, an application of Theorem 6.2 gives

    $\Bigl|\frac{1}{n}\bigl|\{k \le n \mid \omega(k) \in A\}\bigr| - \nu_1^{(1)}\{A\}\Bigr| \ll \frac{1}{\log\log n}$

for any set $A$ of positive integers, where

= Po(loglog-){j}(l + i5i{l-|^^} 

Higher expansions could be computed in much the same way. 

Alternatively, a more accurate approximation is available from
Corollary 4.2 while staying within the realm of (translated) Poisson distributions.

For this, we compute the expansion of $\Phi_1$ to order 2, obtaining (after
some calculations) that

    $\Phi_1(w) = 1 + B_1 w + a_2 w^2 + O(w^3)$, as $w \to 0$,

where 



    $a_2 = \frac{B_1^2}{2} - \frac{\pi^2}{12} - \frac12 \sum_{q\ \mathrm{prime}} \frac{1}{q^2}$

(use $1/\Gamma(1+w) = 1 + \gamma w + (\gamma^2/2 - \pi^2/12)w^2 + O(w^3)$, as well as the Mertens
identity

    $\gamma + \sum_{q\ \mathrm{prime}} \Bigl(\frac1q + \log\Bigl(1 - \frac1q\Bigr)\Bigr) = B_1,$

and expand every term in the Euler product). This corresponds to
since $w = e^{i\theta} - 1$, and therefore we have (3.1) with



q prime 

We can then apply Corollary 4.2 to get the translated Poisson
approximation $Q_{\lambda' m p}$, with parameters calculated using (4.4). With

    $x := B_1^2 - 2a_2 = \frac{\pi^2}{6} + \sum_{q\ \mathrm{prime}} \frac{1}{q^2} \approx 2.0971815,$

this gives

    $p \approx 0.31173945; \qquad m = 2;$

    $\lambda' = \log\log n + B_1 - x - p(1-p) \approx \log\log n - 2.0502422.$
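
The numerical values quoted here can be reproduced with a short computation. The sketch below rests on two assumptions: that $x = \pi^2/6 + \sum_q q^{-2}$, the sum being over primes, and that $m$ and $p$ are obtained as $m = \lfloor x \rfloor$ and $p = \sqrt{x - m}$. Both are inferred from the quoted values and from matching the mean and variance of the mixture; the function names and the truncation of the prime sum at $10^6$ are illustrative.

```python
import math

B1 = 0.26149721  # Mertens constant, as quoted in the text

def prime_square_reciprocal_sum(limit=10**6):
    """Sum of 1/q^2 over primes q <= limit; the neglected tail is below 1e-7."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return sum(1.0 / (q * q) for q in range(2, limit + 1) if sieve[q])

def translated_poisson_parameters():
    """Return (x, m, p, shift), where lambda' = log log n + shift."""
    x = math.pi ** 2 / 6 + prime_square_reciprocal_sum()
    m = math.floor(x)
    p = math.sqrt(x - m)          # assumption: p^2 is the fractional part of x
    shift = B1 - x - p * (1 - p)
    return x, m, p, shift
```

With these choices the mixture $p\,\mathrm{Po}(\lambda')*\delta_{m+1} + (1-p)\,\mathrm{Po}(\lambda')*\delta_m$ has mean $\log\log n + B_1$ and variance $\log\log n + B_1 - x$.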
Thus for any positive integer $n$ and any set $A$ of positive integers, we have

    $\frac{1}{n}\bigl|\{k \le n \mid \omega(k) \in A\}\bigr| - \bigl\{p\,\mathrm{Po}(\lambda')\{A - 3\} + (1-p)\,\mathrm{Po}(\lambda')\{A - 2\}\bigr\}$



where, again, we can use the total variation norm in view of the previous
remark. Similar results hold for $\Omega(n)$, where one obtains the following
approximate values:

    $p \approx 0.5195; \qquad m = 0;$

    $\lambda' \approx \log\log n + 0.5152.$